Distributed Algorithms for the Transitive Closure

Author

  • Eric Gribkoff
Abstract

Many database queries, such as reachability and regular path queries, can be reduced to finding the transitive closure of the underlying graph. For calculating the transitive closure of large graphs, a distributed computation framework is required to handle the large data volume (which can approach O(|V|^2) space). Map Reduce was not originally designed for recursive computations, but recent work has sought to expand its capabilities in this regard. Distributed, recursive evaluation of the transitive closure faces two main challenges in the Map Reduce environment: a large number of rounds may be required to find all tuples in the transitive closure on large graphs, and the number of duplicate tuples derived may incur a data volume cost far greater than the size of the transitive closure itself. Seminaive and smart are two algorithms for computing the transitive closure that make different choices in handling the tradeoff between these two problems. Recent work suggests that smart may be superior to seminaive in the Map Reduce paradigm, based on an analysis of the algorithms but without an accompanying implementation. This view diverges from earlier work, which found seminaive to be superior in the majority of cases, albeit in a non-distributed environment. This paper presents implementations of seminaive and smart transitive closure built upon the Hadoop framework for distributed computation and compares the performance of the algorithms across a broad class of data sets, including trees, acyclic graphs, and cyclic graphs.
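
To make the round-count tradeoff between the two algorithms concrete, the following is a minimal single-machine sketch in Python. It is not the paper's Hadoop implementation: the function names (`compose`, `seminaive_tc`, `smart_tc`) are hypothetical, relations are plain in-memory sets of (source, target) pairs, and the smart variant follows the commonly described path-doubling formulation, which is assumed here rather than taken from the paper's text.

```python
from collections import defaultdict


def compose(r, s):
    """Relational composition: all (a, c) with (a, b) in r and (b, c) in s."""
    successors = defaultdict(set)
    for b, c in s:
        successors[b].add(c)
    return {(a, c) for a, b in r for c in successors[b]}


def seminaive_tc(edges):
    """Extend only the newly found tuples by one edge per round.

    Returns (closure, rounds). The number of rounds grows linearly with
    the longest shortest path in the graph.
    """
    edges = set(edges)
    closure = set(edges)
    delta = set(edges)
    rounds = 0
    while delta:
        # Join the new tuples with the edge relation, then drop duplicates.
        delta = compose(delta, edges) - closure
        closure |= delta
        rounds += 1
    return closure, rounds


def smart_tc(edges):
    """Path doubling (assumed formulation of the smart algorithm).

    At the start of round i (counting from 0), q holds pairs whose shortest
    path has length exactly 2**i and p holds pairs reachable in fewer than
    2**i steps. Returns (closure, rounds); the number of rounds grows
    logarithmically with the longest shortest path.
    """
    q = set(edges)
    p = set()
    rounds = 0
    while q:
        delta_p = compose(q, p)      # lengths 2**i + 1 .. 2**(i+1) - 1
        p = q | p | delta_p          # lengths 1 .. 2**(i+1) - 1
        q = compose(q, q) - p        # shortest length exactly 2**(i+1)
        rounds += 1
    return p, rounds


if __name__ == "__main__":
    # A simple chain 0 -> 1 -> ... -> 8 as a toy input.
    chain = {(i, i + 1) for i in range(8)}
    tc_semi, rounds_semi = seminaive_tc(chain)
    tc_smart, rounds_smart = smart_tc(chain)
    print(len(tc_semi), rounds_semi, rounds_smart)
```

In this sketch, smart needs fewer rounds on long chains, but each round performs two joins and may derive more duplicate tuples before the set difference removes them. This is the tradeoff the abstract describes; in a distributed setting each `compose` call would correspond to a join job rather than an in-memory loop.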


Similar resources

The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution

This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...


On the PVM Computations of Transitive Closure and Algebraic Path Problems

We investigate experimentally alternative approaches to the distributed parallel computation of a class of problems related to the generic transitive closure problem and the algebraic path problem. Our main result is the comparison of two parallel algorithms for transitive closure: a straightforward coarse-grained parallel implementation of the Warshall algorithm named Block-Processing (whic...


Data fragmentation for parallel transitive closure strategies

A topic that is currently inspiring a lot of research is parallel (distributed) computation of transitive closure queries. In [10] the disconnection set approach has been introduced as an effective strategy for such a computation. It involves reformulating a transitive closure query on a relation into a number of transitive closure queries on smaller fragments; these queries can then execute inde...


Data fragmentation for parallel transitive closure strategies - Data Engineering, 1993. Proceedings., Ninth International Conference on

A topic that is currently inspiring a lot of research is parallel (distributed) computation of transitive closure queries. In [10] the disconnection set approach has been introduced as an effective strategy for such a computation. It involves reformulating a transitive closure query on a relation into a number of transitive closure queries on smaller fragments; these queries can then exec...


Efficient Transitive Closure Computation

We present two new transitive closure algorithms that are based on strong component detection. The algorithms scan the input graph only once without generating partial successor sets for each node. The new algorithms eliminate the redundancy caused by strong components more efficiently than previous transitive closure algorithms. We present statistically sound simulation experiments showing that ...



Publication date: 2013